Hartley & Zisserman, Chapter 9

Epipolar Geometry & the Fundamental Matrix

The geometry of two views. Epipoles, epipolar lines, the fundamental matrix F, the essential matrix E, and how they constrain point correspondences between two images.

Prerequisites: Chapter 6 (Camera Models) + Chapter 8 (Single View Geometry).

Chapter 0: Why Epipolar Geometry?

Suppose you see a point x in one image. Where can its correspondence x' appear in a second image? Without any knowledge of the 3D point, x' could be anywhere in the second image — or could it?

No. The point x back-projects to a ray through the first camera centre. The second camera sees this ray as a line in its image. The correspondence x' must lie on this line. This line is the epipolar line of x.

The epipolar constraint: Given a point in one image, its correspondence in the other image is constrained to lie on a single line — the epipolar line. This reduces the search for correspondences from a 2D problem to a 1D problem. The matrix that encodes this constraint is the fundamental matrix F.

Epipolar Geometry

A point in the left image constrains its match to an epipolar line in the right image. All epipolar lines pass through the epipole.

Given a point x in one image, where is its correspondence x' constrained to lie in the second image?

On the epipolar line l' = Fx
Anywhere in the image
At the same pixel coordinates

Chapter 1: Epipolar Geometry

The geometry is defined by the two camera centres C and C', and a 3D point X. These three points define a plane called the epipolar plane.

Entity	Definition
Epipolar plane	The plane through C, C', and X
Baseline	The line joining C and C'
Epipole e	Image of C' in the first camera (where the baseline pierces the first image plane)
Epipole e'	Image of C in the second camera
Epipolar line l	Intersection of epipolar plane with first image plane
Epipolar line l'	Intersection of epipolar plane with second image plane

Key property: As X varies, the epipolar plane rotates around the baseline. This sweeps out a pencil of epipolar lines in each image, all passing through the respective epipole. The epipoles are the fixed points of this pencil.

The correspondence between epipolar lines l ↔ l' is a 1D projective transformation (a homography of the pencil) with 3 degrees of freedom.

What is the epipole e in the first image?

The image of the second camera centre C' in the first camera
The principal point of the first camera
The image of the 3D point X in the first camera

Chapter 2: The Fundamental Matrix F

The fundamental matrix F is a 3×3 matrix of rank 2 that algebraically encodes the epipolar geometry. For any pair of corresponding points x ↔ x':

x'^T F x = 0

This is the correspondence condition. It says that x' lies on the epipolar line l' = Fx, and equivalently that x lies on l = F^Tx'.
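The correspondence condition can be checked numerically. The sketch below assumes a hypothetical rig P = K[I | 0], P' = K[R | t] with made-up values for K, R and t, and uses the closed form for F that appears later in this text; it is an illustration, not a method for estimating F from images.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Hypothetical rig (all numbers are illustrative): P = K[I | 0], P' = K[R | t].
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-1.0, 0.1, 0.2])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = K @ np.hstack([R, t.reshape(3, 1)])

# Closed form for this rig, normalised because F is only defined up to scale.
K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv
F /= np.linalg.norm(F)

# Project one 3D point into both views to obtain a true correspondence x <-> x'.
X = np.array([0.5, -0.3, 4.0, 1.0])
x = P1 @ X
x /= x[2]
x_prime = P2 @ X
x_prime /= x_prime[2]

l_prime = F @ x      # epipolar line of x in the second image
l = F.T @ x_prime    # epipolar line of x' in the first image

print("x'^T F x =", x_prime @ F @ x)      # close to zero
print("l' . x'  =", l_prime @ x_prime)    # x' lies on l'
print("l  . x   =", l @ x)                # x lies on l
```

All three printed values are the same scalar written three ways, which is why a single equation covers both epipolar lines.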

Deriving F from a homography: Pick any plane in the scene. It induces a homography H mapping points in the first image to the second. For any corresponding pair, x' lies on the line through Hx and e'. Therefore: F = [e']_× H. The fundamental matrix is the product of a cross-product matrix (encoding the epipole) and a homography.
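A small numerical sketch of this construction follows. It assumes a hypothetical rig P = K[I | 0], P' = K[R | t] and a hypothetical scene plane n^T X + d = 0; the closed form H = K(R − t n^T/d)K⁻¹ for the plane-induced homography is a standard result that is assumed here rather than taken from this text. The check is that F = [e']_× H satisfies the correspondence condition for points that do not lie on the plane.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Hypothetical rig: P = K[I | 0], P' = K[R | t] (illustrative values).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-1.0, 0.1, 0.2])

# Hypothetical scene plane n^T X + d = 0 in the first camera's frame.
n = np.array([0.1, -0.2, 1.0])
d = -5.0

# Plane-induced homography (assumed closed form for this rig).
H = K @ (R - np.outer(t, n) / d) @ np.linalg.inv(K)

# Sanity check: H transfers the image of a point ON the plane correctly.
Xp = np.array([0.4, -0.6, 0.0])
Xp[2] = (-d - n[0] * Xp[0] - n[1] * Xp[1]) / n[2]
transfer = H @ (K @ Xp)
transfer /= transfer[2]
target = K @ (R @ Xp + t)
target /= target[2]

# Epipole e' = P'C with C = (0, 0, 0, 1)^T, i.e. the last column of P'.
e_prime = K @ t
F = skew(e_prime) @ H
F /= np.linalg.norm(F)

# Random 3D points that are NOT restricted to the plane.
rng = np.random.default_rng(0)
points = rng.uniform([-2.0, -2.0, 3.0], [2.0, 2.0, 8.0], size=(20, 3))
residuals = []
for Xw in points:
    x = K @ Xw
    x_prime = K @ (R @ Xw + t)
    x, x_prime = x / x[2], x_prime / x_prime[2]
    residuals.append(abs(x_prime @ F @ x))

print("plane point transferred correctly:", np.allclose(transfer, target))
print("max |x'^T F x| over off-plane points:", max(residuals))
```

Changing n and d changes H but should leave the normalised F unchanged, since the term that depends on the plane is annihilated by [e']_×.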

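F maps points to lines: x → l' = Fx. Such a point-to-line map is called a correlation. Unlike a proper correlation, F is not invertible: every line Fx passes through e', so the whole image plane is mapped onto a one-parameter pencil of lines, and the epipole e is mapped to zero (Fe = 0). That is why F has rank 2, not 3.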

The fundamental matrix F has what rank?

Rank 2
Rank 3
Rank 1

Chapter 3: Properties of F

Property	Formula
Correspondence condition	x'^T F x = 0
Epipolar line from x	l' = Fx
Epipolar line from x'	l = F^Tx'
Epipole e' (left null-vector)	e'^TF = 0
Epipole e (right null-vector)	Fe = 0
Transpose symmetry	F for (P, P') ⇒ F^T for (P', P)
Degrees of freedom	7 (9 entries − 1 scale − 1 for det F = 0)

Counting DOF: F has 7 DOF. Alternatively: 2 for e, 2 for e', and 3 for the 1D homography mapping the pencil of epipolar lines between the two images. Total: 2 + 2 + 3 = 7.

F is not invertible (rank 2), so it defines a mapping from points to lines but not from points to points. It is not a projective transformation.
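Several of the properties above can be read off a singular value decomposition. The sketch below is an illustration under assumed values: the rig P = K[I | 0], P' = K[R | t] and its numbers are hypothetical, and the rank threshold of 1e-10 is an arbitrary choice.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def parallel(a, b, tol=1e-6):
    """True if homogeneous 3-vectors a and b agree up to a non-zero scale."""
    a = a / np.linalg.norm(a)
    b = b / np.linalg.norm(b)
    return np.linalg.norm(np.cross(a, b)) < tol

# Hypothetical rig: P = K[I | 0], P' = K[R | t] (illustrative values).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-1.0, 0.1, 0.2])
K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv

U, S, Vt = np.linalg.svd(F)
rank = int(np.sum(S > 1e-10 * S[0]))  # numerical rank with a relative threshold
e = Vt[-1]           # right null-vector: F e = 0
e_prime = U[:, -1]   # left null-vector:  e'^T F = 0

# Geometric definitions: e is the image of C', e' is the image of C.
C_prime = -R.T @ t   # centre of the second camera, from R C' + t = 0
e_geo = K @ C_prime
e_prime_geo = K @ t

# Every epipolar line l' = F x passes through e', whatever x is.
x_any = np.array([100.0, 50.0, 1.0])
incidence = (F @ x_any) @ e_prime

print("rank of F:", rank)
print("e  matches image of C':", parallel(e, e_geo))
print("e' matches image of C: ", parallel(e_prime, e_prime_geo))
print("(F x) . e' =", incidence)
```

With an F estimated from noisy correspondences the smallest singular value is usually not zero, so the same two null-vectors are then only approximate epipoles.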

How many degrees of freedom does the fundamental matrix F have?

7
8
9

Chapter 4: F from Camera Matrices

Given two camera matrices P and P', the fundamental matrix is:

F = [e']_× P' P⁺

where P⁺ is the pseudo-inverse of P (PP⁺ = I) and e' = P'C is the epipole (the image of the first camera centre C in the second camera).

For canonical cameras: If P = [I | 0] and P' = [M | m], then:
F = [e']_× M = [m]_× M
where e' = m. This is because the first camera centre is the origin, and its image through P' is P'(0,0,0,1)^T = m.

For a calibrated stereo rig P = K[I | 0] and P' = K'[R | t]:

F = K'^−T [t]_× R K⁻¹

Note: F is defined only when the camera centres are distinct. If C = C', then e' = P'C = 0 and F = 0 (the zero matrix).
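The formulas in this chapter can be cross-checked against each other. In the sketch below, `fundamental_from_cameras` and `same_up_to_scale` are hypothetical helper names, and the camera values are made up; matrices are compared up to scale and sign because F is a homogeneous quantity.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def same_up_to_scale(A, B, tol=1e-8):
    """True if A and B are equal up to a non-zero scalar factor."""
    A = A / np.linalg.norm(A)
    B = B / np.linalg.norm(B)
    return min(np.linalg.norm(A - B), np.linalg.norm(A + B)) < tol

def fundamental_from_cameras(P, P_prime):
    """F = [e']_x P' P^+ with e' = P'C, where C is the right null-vector of P."""
    C = np.linalg.svd(P)[2][-1]          # P C = 0
    e_prime = P_prime @ C
    return skew(e_prime) @ P_prime @ np.linalg.pinv(P)

# Hypothetical rig with different intrinsics in the two views.
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 300.0], [0.0, 760.0, 250.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-1.0, 0.1, 0.2])
I0 = np.hstack([np.eye(3), np.zeros((3, 1))])
P1 = K1 @ I0
P2 = K2 @ np.hstack([R, t.reshape(3, 1)])

# General formula versus the closed form F = K'^-T [t]_x R K^-1.
F = fundamental_from_cameras(P1, P2)
F_closed = np.linalg.inv(K2).T @ skew(t) @ R @ np.linalg.inv(K1)

# Canonical first camera [I | 0] with P' = [M | m]: F = [m]_x M.
M, m = P2[:, :3], P2[:, 3]
F_canonical = fundamental_from_cameras(I0, P2)

# Coincident centres (t = 0): e' vanishes and so does F.
P2_same_centre = K2 @ np.hstack([R, np.zeros((3, 1))])
F_degenerate = fundamental_from_cameras(P1, P2_same_centre)

print("general == closed form:", same_up_to_scale(F, F_closed))
print("canonical == [m]_x M:  ", same_up_to_scale(F_canonical, skew(m) @ M))
print("coincident centres give F ~ 0:", np.allclose(F_degenerate, 0.0, atol=1e-6))
```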

For canonical cameras P = [I|0] and P' = [M|m], what is the epipole e'?

e' = m (the last column of P')
e' = the first column of M
e' = (0, 0, 1)^T

Chapter 5: Cameras from F

Given only F (computed from image correspondences), we can recover a pair of camera matrices up to a projective ambiguity. Set P = [I | 0], then:

P' = [[e']_× F + e' v^T | λ e']

for an arbitrary 3-vector v and non-zero scalar λ. In the simplest form (v = 0, λ = 1): P' = [[e']_× F | e'].

Projective ambiguity: From F alone, we can only recover cameras and structure up to a 4×4 projective transformation. We cannot determine the true Euclidean reconstruction without additional information (calibration, scene constraints, etc.).

Any matrix F of rank 2 is the fundamental matrix of some pair of cameras. This means F defines a valid two-view geometry, and camera matrices can always be extracted.
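The extraction can be verified by a round trip: build P' from F, then recompute the fundamental matrix of the pair ([I | 0], P') with the canonical formula F = [m]_× M. The sketch below assumes a hypothetical ground-truth rig to generate F; the values of v and λ are arbitrary.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def same_up_to_scale(A, B, tol=1e-8):
    """True if A and B are equal up to a non-zero scalar factor."""
    A = A / np.linalg.norm(A)
    B = B / np.linalg.norm(B)
    return min(np.linalg.norm(A - B), np.linalg.norm(A + B)) < tol

def cameras_from_F(F, v=np.zeros(3), lam=1.0):
    """Return P = [I | 0] and P' = [[e']_x F + e' v^T | lam e'] for a rank-2 F."""
    e_prime = np.linalg.svd(F)[0][:, -1]      # left null-vector: e'^T F = 0
    P = np.hstack([np.eye(3), np.zeros((3, 1))])
    M = skew(e_prime) @ F + np.outer(e_prime, v)
    P_prime = np.hstack([M, (lam * e_prime).reshape(3, 1)])
    return P, P_prime

# Hypothetical ground-truth rig, used only to produce a valid F.
K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-1.0, 0.1, 0.2])
K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv
F /= np.linalg.norm(F)

# Simplest choice (v = 0, lam = 1) and an arbitrary other choice.
_, P_simple = cameras_from_F(F)
_, P_other = cameras_from_F(F, v=np.array([0.3, -1.0, 2.0]), lam=2.5)

# Fundamental matrix of ([I | 0], [M | m]) is [m]_x M.
F_simple = skew(P_simple[:, 3]) @ P_simple[:, :3]
F_other = skew(P_other[:, 3]) @ P_other[:, :3]

print("simple choice reproduces F:", same_up_to_scale(F, F_simple))
print("other choice reproduces F: ", same_up_to_scale(F, F_other))
```

Both camera pairs reproduce the same F even though neither P' equals the ground-truth K[R | t], which is the projective ambiguity in concrete form.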

From F alone, to what level of ambiguity can we recover cameras and structure?

Up to a projective transformation (a 4×4 matrix H)
Up to a similarity transformation
Exactly (no ambiguity)

Chapter 6: The Essential Matrix

The essential matrix E is the specialization of F to calibrated cameras. If normalized image coordinates x̂ = K⁻¹x are used:

E = K'^T F K ⇒ x̂'^T E x̂ = 0

E has the form E = [t]_× R, where R is the rotation and t is the translation between cameras. It has only 5 DOF (3 for R + 2 for the direction of t, since overall scale is unrecoverable).

The essential matrix constraint: E's two non-zero singular values are equal. This is the additional constraint beyond det E = 0 that distinguishes E from a general F. Counting: a 3×3 homogeneous matrix has 8 DOF, det E = 0 removes one, and equality of the two non-zero singular values removes two more, leaving 8 − 3 = 5 DOF.
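The singular-value property is easy to observe numerically. The sketch below assumes a hypothetical rig with different intrinsics K and K' and an arbitrary overall scale on F. The cubic identity 2EEᵀE − tr(EEᵀ)E = 0 that it also evaluates is a known equivalent form of the constraint; it is not stated in this text and is included only as an extra check.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Hypothetical rig: P = K1[I | 0], P' = K2[R | t] (illustrative values).
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 300.0], [0.0, 760.0, 250.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([-1.0, 0.1, 0.2])

# F is only known up to scale, so an arbitrary factor is applied on purpose.
F = 3.7 * np.linalg.inv(K2).T @ skew(t) @ R @ np.linalg.inv(K1)

# Essential matrix from F and the calibration matrices.
E = K2.T @ F @ K1

s_E = np.linalg.svd(E, compute_uv=False)
s_F = np.linalg.svd(F, compute_uv=False)

# Equivalent cubic form of the equal-singular-values constraint (assumed identity).
cubic = 2.0 * E @ E.T @ E - np.trace(E @ E.T) * E
cubic_residual = np.linalg.norm(cubic) / np.linalg.norm(E) ** 3

print("singular values of E / largest:", s_E / s_E[0])   # close to [1, 1, 0]
print("singular values of F / largest:", s_F / s_F[0])   # not equal in general
print("relative cubic residual:", cubic_residual)
```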

From E, the rotation R and translation direction t can be extracted (up to a 4-fold ambiguity). The chirality constraint (all points must have positive depth in both cameras) selects the unique correct solution.
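One common way to carry out this extraction is through the SVD of E, followed by a depth test on a triangulated point. The sketch below is one such implementation under stated assumptions: it works in normalized coordinates (K = I), uses noise-free synthetic data from a hypothetical ground-truth motion, and relies on a simple linear triangulation. The names `decompose_essential` and `triangulate` are hypothetical helpers.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def decompose_essential(E):
    """Return the four (R, t) candidates for E, with t a unit vector."""
    U, _, Vt = np.linalg.svd(E)
    # Flip signs so that every candidate rotation has determinant +1.
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    u3 = U[:, 2]
    return [(U @ W @ Vt, u3), (U @ W @ Vt, -u3),
            (U @ W.T @ Vt, u3), (U @ W.T @ Vt, -u3)]

def triangulate(P1, P2, x1, x2):
    """Linear triangulation of one correspondence; returns a 3D point."""
    A = np.vstack([x1[0] * P1[2] - P1[0],
                   x1[1] * P1[2] - P1[1],
                   x2[0] * P2[2] - P2[0],
                   x2[1] * P2[2] - P2[1]])
    X = np.linalg.svd(A)[2][-1]
    return X[:3] / X[3]

# Hypothetical ground truth; t is unit length because scale is unrecoverable.
a = np.deg2rad(10.0)
R_true = np.array([[np.cos(a), 0.0, np.sin(a)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(a), 0.0, np.cos(a)]])
t_true = np.array([-1.0, 0.1, 0.2])
t_true /= np.linalg.norm(t_true)
E = skew(t_true) @ R_true

# One correspondence in normalized coordinates, from a point in front of both cameras.
X_true = np.array([0.5, -0.3, 4.0])
x1 = X_true / X_true[2]
X_cam2 = R_true @ X_true + t_true
x2 = X_cam2 / X_cam2[2]

P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
valid = []
for R, t in decompose_essential(E):
    P2 = np.hstack([R, t.reshape(3, 1)])
    X = triangulate(P1, P2, x1, x2)
    # Keep the candidate only if the point has positive depth in both cameras.
    if X[2] > 0 and (R @ X + t)[2] > 0:
        valid.append((R, t))

R_best, t_best = valid[0]
print("candidates passing the depth test:", len(valid))
print("rotation recovered:   ", np.allclose(R_best, R_true, atol=1e-8))
print("translation recovered:", np.allclose(t_best, t_true, atol=1e-8))
```

With noisy data a single point is a fragile test, so implementations typically count how many triangulated points pass for each candidate and keep the majority winner.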

How many degrees of freedom does the essential matrix E have?

5 (3 rotation + 2 translation direction)
7 (same as F)
6

Chapter 7: Pure Translation

When the camera undergoes a pure translation (no rotation, no change in K), the fundamental matrix takes a special form: F = [e']_×, which is skew-symmetric and has only 2 DOF (the position of the epipole).

In this case e = e' = v, where v is the vanishing point of the translation direction (the Focus of Expansion). All epipolar lines radiate from v. Points appear to "flow" outward from v, and closer points move faster than distant ones.

Auto-epipolar property: For pure translation, with the two images overlaid, each point x lies on its own epipolar line (since x, x', and e = e' are collinear). This is because the apparent motion of each point is along the line from x to the vanishing point. Motion parallax (closer objects move more) is one of the depth cues used by biological vision.

If the translation is along the x-axis, e = (1, 0, 0)^T and the epipolar lines are horizontal scan lines: y' = y. This is the ideal configuration for stereo matching.
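Both claims can be checked with a few lines of NumPy. The sketch assumes a hypothetical zero-skew calibration matrix K shared by the two views and made-up translations; it confirms that F is proportional to [e']_× (the same matrix as [e]_×, since e = e') and that an x-axis translation gives horizontal epipolar lines.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def same_up_to_scale(A, B, tol=1e-8):
    """True if A and B are equal up to a non-zero scalar factor."""
    A = A / np.linalg.norm(A)
    B = B / np.linalg.norm(B)
    return min(np.linalg.norm(A - B), np.linalg.norm(A + B)) < tol

# Hypothetical intrinsics, identical in both views (illustrative values).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K)

# General pure translation: R = I, so F = K^-T [t]_x K^-1.
t = np.array([0.3, -0.2, 1.0])
F = K_inv.T @ skew(t) @ K_inv
e_prime = K @ t                      # epipole, also the focus of expansion

# Translation along the x-axis: the epipole is at infinity, e' ~ (1, 0, 0)^T.
t_x = np.array([1.0, 0.0, 0.0])
e_x = K @ t_x
x = np.array([150.0, 90.0, 1.0])     # a pixel in the first image
l_prime = skew(e_x) @ x              # its epipolar line (a, b, c): a*u + b*v + c = 0
row = -l_prime[2] / l_prime[1]       # the line is v = row

# Direct simulation: project a 3D point before and after the x-translation.
X = np.array([0.5, -0.3, 4.0])
p = K @ X
p_shifted = K @ (X + t_x)
y, y_shifted = p[1] / p[2], p_shifted[1] / p_shifted[2]

print("F proportional to [e']_x:", same_up_to_scale(F, skew(e_prime)))
print("F skew-symmetric:        ", np.allclose(F + F.T, 0.0))
print("epipolar line of x is the row v =", row)
print("image row before / after the shift:", y, y_shifted)
```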

For pure translational camera motion, the fundamental matrix is:

F = [e]_×, a skew-symmetric matrix with 2 DOF
The same as any general F with 7 DOF
The zero matrix

Chapter 8: Planar Motion

Planar motion occurs when the rotation axis is perpendicular to the translation direction and the intrinsics K are unchanged (e.g., a car moving forward while turning). This imposes an extra constraint: the symmetric part F_s = (F + F^T)/2 has rank 2 (det F_s = 0).

Why it matters: Planar motion arises in many practical settings — ground vehicles, turntables, and robots moving on a floor. The reduced DOF (6 instead of 7) means fewer correspondences are needed, and the special structure can be exploited for more robust estimation.

The fundamental matrix for planar motion can be parameterized as F = [e']_× [l_s]_× [e]_×, which automatically satisfies both the rank-2 constraint on F and the rank-2 constraint on F_s.
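The constraint on the symmetric part can be observed directly. The sketch below assumes a hypothetical fixed calibration K (in normalized units, to keep determinants well scaled), a rotation about the y-axis, and two made-up translations: one in the x-z plane (planar motion) and one with a component along the rotation axis (general motion).

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rot_y(angle):
    """Rotation by `angle` radians about the y-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def fundamental(K, R, t):
    """F = K^-T [t]_x R K^-1 for a rig with the same K in both views."""
    K_inv = np.linalg.inv(K)
    return K_inv.T @ skew(t) @ R @ K_inv

def det_symmetric_part(F):
    """Determinant of F_s = (F + F^T) / 2."""
    return np.linalg.det(0.5 * (F + F.T))

# Hypothetical intrinsics in normalized units, fixed across the two views.
K = np.array([[1.2, 0.0, 0.1],
              [0.0, 1.1, -0.05],
              [0.0, 0.0, 1.0]])
R = rot_y(np.deg2rad(10.0))

t_planar = np.array([-1.0, 0.0, 0.2])    # perpendicular to the y rotation axis
t_general = np.array([-1.0, 0.4, 0.2])   # has a component along the axis

det_planar = det_symmetric_part(fundamental(K, R, t_planar))
det_general = det_symmetric_part(fundamental(K, R, t_general))

print("det F_s, planar motion: ", det_planar)    # zero up to rounding
print("det F_s, general motion:", det_general)   # clearly non-zero
```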

General camera motion can be decomposed into two steps: first apply the infinite homography H_∞ = K'RK⁻¹ to the first image to undo the rotation and any difference in calibration; the residual motion is then a pure translation. So F = [e']_× H_∞.
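As a consistency check, this factorization can be reduced to the calibrated-rig formula F = K'^−T [t]_× R K⁻¹ given earlier. The derivation below is a sketch that relies on one identity not stated in this text: for any invertible 3×3 matrix M, [Mt]_× = det(M) M^−T [t]_× M⁻¹. The symbol ≃ denotes equality up to a non-zero scale.

```latex
\begin{aligned}
F &= [e']_\times H_\infty,
     \qquad e' = K' t, \quad H_\infty = K' R K^{-1} \\
  &= [K' t]_\times \, K' R K^{-1} \\
  &= \det(K')\, K'^{-T} [t]_\times K'^{-1} \, K' R K^{-1} \\
  &\simeq K'^{-T} [t]_\times R K^{-1}
\end{aligned}
```

The factor det(K') is absorbed into the overall scale of F, so the two expressions describe the same fundamental matrix.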

What additional constraint does planar motion impose on F?

The symmetric part F_s = (F + F^T)/2 must also have rank 2
F must be symmetric
F must be skew-symmetric

Chapter 9: Connections

The fundamental matrix is the central object of two-view geometry. The remaining chapters of Part II build on it.

Chapter	How F connects
Ch 10: Reconstruction	F determines cameras (up to projective ambiguity) ⇒ enables 3D reconstruction
Ch 11: Computing F	Practical algorithms: 8-point, normalized, RANSAC
Ch 12: Triangulation	F provides the epipolar constraint; triangulation finds 3D points
Ch 13: Homographies	F = [e']_× H relates homographies and the fundamental matrix
Ch 15: Trifocal Tensor	Three-view generalization: the trifocal tensor encodes what F does for two views

"The fundamental matrix is the algebraic representation of epipolar geometry."

— Hartley & Zisserman, Chapter 9

What is the key relationship between F and a plane-induced homography H?

F = [e']_× H: F is the cross-product matrix of the epipole e' multiplied by the homography
F = H
F = H⁻¹
