Hartley & Zisserman, Chapter 13

Scene Planes & Homographies

Plane-induced homographies, computing F from a homography, the infinite homography H∞, parallax, and the relationship between homographies and epipolar geometry.

Prerequisites: Chapter 9 (Epipolar Geometry) + Chapter 11 (Computing F).

Chapter 0: Why Homographies?

A plane in 3D induces a homography between two views: every point on the plane maps to a unique point in each image, and the two image points are related by a 3×3 invertible matrix H. Points not on the plane deviate from this homography — this deviation is the parallax.

The key relationship: F = [e']× H, where [e']× is the skew-symmetric matrix that takes the cross product with the epipole e' in the second image. Any plane-induced homography, combined with the epipole in this way, yields the same fundamental matrix. This connects homographies and epipolar geometry, and provides tools for reconstruction, plane detection, and motion analysis.

Chapter 1: A Plane Induces a Homography

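Let π = (nᵀ, d)ᵀ be a plane in 3D, so that points X on π (in inhomogeneous coordinates) satisfy nᵀX + d = 0. For cameras P = K[I | 0] and P' = K'[R | t], such a point projects to x = KX and x' = K'(RX + t). On the plane −nᵀX/d = 1, so t can be written as −t nᵀX/d, giving x' = K'(R − t nᵀ/d)X = K'(R − t nᵀ/d)K⁻¹x. The resulting homography is: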

H = K'(R − t nᵀ/d) K⁻¹

This maps points in image 1 to points in image 2 for all 3D points lying on the plane π.

Every plane gives a different H, but they all satisfy F = [e']×H. The family of homographies induced by all planes in the scene is parameterized by the plane equation (n, d). Each shares the same epipole e' and the same F.
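The formula above can be checked numerically. The following is a minimal NumPy sketch, not a reference implementation: the calibration matrices, rotation, translation and plane are made-up values, and the helper names `skew` and `plane_homography` are hypothetical. It builds H for one plane, transfers an on-plane point, and compares [e']×H with the fundamental matrix [e']×K'RK⁻¹ of the same camera pair.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix [v]x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def plane_homography(K1, K2, R, t, n, d):
    """H = K2 (R - t n^T / d) K1^-1 for the plane n^T X + d = 0."""
    return K2 @ (R - np.outer(t, n) / d) @ np.linalg.inv(K1)

# Illustrative (made-up) calibration, motion and plane.
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 300.0], [0.0, 750.0, 250.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.5, 0.1, 0.2])
n, d = np.array([0.1, -0.2, 1.0]), -5.0

H = plane_homography(K1, K2, R, t, n, d)

# A 3D point on the plane: pick X and Y, then solve n^T X + d = 0 for Z.
X = np.array([0.3, -0.4, 0.0])
X[2] = -(d + n[0] * X[0] + n[1] * X[1]) / n[2]

x1 = K1 @ X              # projection into image 1 (homogeneous)
x2 = K2 @ (R @ X + t)    # projection into image 2 (homogeneous)
x2_pred = H @ x1         # homography transfer of x1
transfer_error = np.linalg.norm(x2[:2] / x2[2] - x2_pred[:2] / x2_pred[2])

# The epipole e' is the image of the first camera centre: e' = K2 t.
e2 = K2 @ t
F_from_H = skew(e2) @ H
F_direct = skew(e2) @ K2 @ R @ np.linalg.inv(K1)
print(transfer_error, np.allclose(F_from_H, F_direct, atol=1e-6))
```

The two fundamental matrices agree because the plane-dependent term of H is a multiple of e', which [e']× annihilates; changing `n` or `d` changes H but not F.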

Chapter 2: H from Camera Matrices

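Given camera matrices P, P' and a plane π, the homography is H = P'(I − Cπᵀ)P⁺, where P⁺ is the pseudo-inverse of P and C is the centre of the first camera (PC = 0), scaled so that πᵀC = 1; this requires that C does not lie on π. More practically, for canonical cameras P = [I|0] and P' = [M|m]: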

H = M − m vᵀ, with v = (π₁, π₂, π₃)ᵀ / π₄
Given F and three additional correspondences (on the plane): F alone determines H only up to a 3-DOF ambiguity. Specifically, H = [e']×F + e'vᵀ, where the 3-vector v encodes the plane. A correspondence x ↔ x' already satisfies the epipolar constraint, so requiring Hx = x' (up to scale) yields only one new linear equation in v. Three correspondences of points on the plane are therefore needed to fix v and hence H.

Chapter 3: H from F and Correspondences

Given F and a set of correspondences known to lie on a plane, the plane-induced homography can be computed. The correspondences must satisfy both x' = Hx (up to scale) and x'ᵀFx = 0.

Method: parameterize H = [e']×F + e'vᵀ (3 DOF in v). Each on-plane correspondence gives one linear equation in v, so 3 correspondences whose image points are not collinear determine v. Then H is fully determined.

Key insight: The relationship F = [e']×H means that F and H are not independent. Given H (from a plane), F can be computed with just two additional off-plane correspondences, which fix the epipole e'. Given F, H can be computed with 3 on-plane correspondences.
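The method of this chapter can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions: the function name `homography_from_F` is hypothetical, the scene values are made up, and the epipole is taken as the left null vector of F. Because x'ᵢ, e' and [e']×Fxᵢ all lie on the same epipolar line, each correspondence reduces to one scalar equation for vᵀxᵢ.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix [v]x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def homography_from_F(F, x1s, x2s):
    """Plane homography from F and three on-plane correspondences.

    x1s, x2s: (3, 3) arrays whose rows are homogeneous image points.
    Uses H = A + e' v^T with A = [e']x F and solves three equations for v.
    """
    e2 = np.linalg.svd(F)[0][:, -1]      # left null vector of F: the epipole e'
    A = skew(e2) @ F
    b = np.empty(3)
    for i in range(3):
        line = np.cross(x2s[i], e2)      # epipolar line through x'_i and e'
        # x'_i x (A x_i) is parallel to that line; the ratio gives v^T x_i.
        b[i] = -np.cross(x2s[i], A @ x1s[i]) @ line / (line @ line)
    v = np.linalg.solve(x1s, b)          # rows of x1s are x_i^T
    return A + np.outer(e2, v)

# Made-up scene: shared calibration K, small rotation, plane n^T X + d = 0.
K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(8.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.4, -0.1, 0.1])
n, d = np.array([0.2, 0.1, 1.0]), -6.0
H_true = K @ (R - np.outer(t, n) / d) @ np.linalg.inv(K)
F = skew(K @ t) @ H_true

def on_plane(X, Y):
    """3D point on the plane with the given X and Y coordinates."""
    return np.array([X, Y, -(d + n[0] * X + n[1] * Y) / n[2]])

pts = [on_plane(-1.0, -1.0), on_plane(1.0, -0.5),
       on_plane(0.2, 1.0), on_plane(-0.6, 0.7)]
x1s = np.array([K @ P for P in pts])
x2s = np.array([K @ (R @ P + t) for P in pts])

H = homography_from_F(F, x1s[:3], x2s[:3])

# Check on a fourth on-plane point that was not used to solve for v.
y = H @ x1s[3]
err = np.linalg.norm(y[:2] / y[2] - x2s[3, :2] / x2s[3, 2])
print(err)
```

The recovered H equals the true homography only up to scale, which is why the check compares inhomogeneous image points rather than matrix entries.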

Chapter 4: Computing F from H

If a plane-induced homography H is known (e.g., from 4 coplanar point correspondences), then F can be computed from H plus two additional off-plane correspondences, six points in total.

Since F = [e']×H, we need to find e'. Given an off-plane correspondence x ↔ x', the point x' does not equal Hx (because x is not on the plane). But x' and Hx both lie on the epipolar line of x, which passes through e'. Therefore:

l' = x' × Hx

is a line through the epipole. One off-plane correspondence only constrains e' to this line. A second off-plane correspondence gives a second line, and the epipole is their intersection: e' = (x'₁ × Hx₁) × (x'₂ × Hx₂). Then F = [e']×H.

This is useful in practice: when most of the scene is planar (e.g., a building facade), compute H from 4 or more coplanar correspondences, then use two or more off-plane correspondences to determine e' and hence F. This avoids the degeneracy of estimating F from coplanar points alone.
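As a sketch of this construction: each off-plane correspondence gives a line x' × Hx through the epipole, and two such lines intersect in e'. The NumPy example below uses made-up cameras and points, and the names `fundamental_from_homography` and `image_point` are hypothetical. It recovers e' from two off-plane points and tests the resulting F on a third.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix [v]x, so that skew(v) @ w equals np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fundamental_from_homography(H, x1a, x2a, x1b, x2b):
    """F from a plane homography H and two off-plane correspondences."""
    line_a = np.cross(x2a, H @ x1a)   # line through x'_a and H x_a
    line_b = np.cross(x2b, H @ x1b)   # line through x'_b and H x_b
    e2 = np.cross(line_a, line_b)     # epipole: intersection of the two lines
    return skew(e2) @ H, e2

def image_point(K, R, t, X):
    """Project a 3D point and scale the result to third coordinate 1."""
    x = K @ (R @ X + t)
    return x / x[2]

# Made-up scene: shared calibration, small rotation, plane Z = 5.
K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(6.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.5, 0.0, 0.1])
n, d = np.array([0.0, 0.0, 1.0]), -5.0
H = K @ (R - np.outer(t, n) / d) @ np.linalg.inv(K)

I3, zero = np.eye(3), np.zeros(3)
Xa, Xb, Xc = (np.array([0.5, 0.2, 3.0]),
              np.array([-0.7, 0.4, 8.0]),
              np.array([0.1, -0.6, 4.0]))   # none of these lies on Z = 5

F, e2 = fundamental_from_homography(
    H,
    image_point(K, I3, zero, Xa), image_point(K, R, t, Xa),
    image_point(K, I3, zero, Xb), image_point(K, R, t, Xb))

# The recovered epipole should match K t, and a third off-plane
# correspondence should satisfy the epipolar constraint x'^T F x = 0.
e2_true = K @ t
x1c, x2c = image_point(K, I3, zero, Xc), image_point(K, R, t, Xc)
residual = abs(x2c @ F @ x1c) / (np.linalg.norm(F) * np.linalg.norm(x1c)
                                 * np.linalg.norm(x2c))
print(e2[:2] / e2[2], e2_true[:2] / e2_true[2], residual)
```

With noisy data and more than two off-plane correspondences, the epipole would instead be estimated as the least-squares intersection of all such lines.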

Chapter 5: The Infinite Homography H∞

The infinite homography H∞ is the homography induced by the plane at infinity π∞. It is the limit of H = K'(R − t nᵀ/d)K⁻¹ as d → ∞. For cameras P = K[I|0] and P' = K'[R|t]:

H∞ = K'RK⁻¹

H∞ maps vanishing points in image 1 to vanishing points in image 2. It depends on rotation R and calibrations K, K', but not on translation t.

H∞ separates rotation from translation: Given x' = K'RK⁻¹x + K't/Z, where x is scaled so that its third coordinate is 1 and Z is the depth of the point in the first camera, the first term (H∞x) captures rotation and calibration changes, while the second term (K't/Z) captures translation-induced parallax. Points at infinity have Z = ∞, so the second term vanishes: H∞ maps them exactly.

H∞ is crucial for: (1) identifying the plane at infinity for affine reconstruction, (2) separating rotation from translation, and (3) image stabilization (removing rotation-induced motion).
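A small numerical illustration of these properties, using made-up cameras: the infinite homography K'RK⁻¹ (called `H_inf` in the code, a name chosen here) should map the vanishing point of a 3D direction exactly, while for a finite point the gap between x' and the transferred point should shrink roughly as 1/Z.

```python
import numpy as np

def dehom(x):
    """Convert a homogeneous 3-vector to inhomogeneous image coordinates."""
    return x[:2] / x[2]

# Made-up calibrations and relative pose.
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 300.0], [0.0, 750.0, 250.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(5.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.3, 0.0, 0.05])

H_inf = K2 @ R @ np.linalg.inv(K1)

# Vanishing point of a 3D direction D: K1 D in image 1, K2 R D in image 2.
D = np.array([1.0, 0.2, 1.0])
v1 = K1 @ D
v2 = K2 @ R @ D
vp_error = np.linalg.norm(dehom(H_inf @ v1) - dehom(v2))

# Finite points along one viewing ray at increasing depth Z.
ray = np.array([0.1, 0.05, 1.0])
gaps = []
for Z in (2.0, 20.0, 200.0):
    X = Z * ray
    x1 = K1 @ X
    x2 = K2 @ (R @ X + t)
    gaps.append(np.linalg.norm(dehom(x2) - dehom(H_inf @ x1)))

print(vp_error, gaps)
```

The translation `t` never enters `H_inf`; it only affects the residual `gaps`, which is the translation-induced parallax.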


Chapter 6: Parallax

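For a point X not on the plane inducing H, the parallax vector is x' − Hx (with both points scaled to third coordinate 1). This vector measures how much the point deviates from the plane-induced prediction. In homogeneous form x' = Hx + ρe', so the vector lies along the line joining Hx and the epipole e'. The scalar ρ, the parallax relative to H, is proportional to the point's distance from the plane and inversely proportional to its depth.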

Parallax encodes depth: Points on the plane have zero parallax. Points in front of the plane have parallax in one direction along the epipolar line; points behind, in the opposite direction. Relative to the infinite homography H∞, the magnitude of ρ is inversely proportional to depth Z. This is the geometric basis of stereo vision.

The parallax vector x' − Hx is always directed towards (or away from) the epipole e'. The epipolar line through x' and Hx passes through e'. This is another manifestation of F = [e']×H.
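The decomposition x' = Hx + ρe' can be checked numerically. The sketch below uses made-up values and a hypothetical helper `relative_parallax`. It assumes the conventions of Chapter 1 (plane nᵀX + d = 0, cameras K[I|0] and K[R|t]) and x scaled to third coordinate 1, for which working through the algebra gives ρ = (nᵀX + d)/(dZ). The code solves for ρ by least squares and compares it with that expression for points in front of, on, and behind the plane.

```python
import numpy as np

def relative_parallax(H, e2, x1, x2):
    """Solve s * x2 = H x1 + rho * e2 for (s, rho); return rho and the residual."""
    A = np.column_stack([x2, -e2])
    sol, *_ = np.linalg.lstsq(A, H @ x1, rcond=None)
    residual = np.linalg.norm(A @ sol - H @ x1)
    return sol[1], residual

# Made-up scene: shared calibration, small rotation, plane Z = 5.
K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(6.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.5, 0.1, 0.1])
n, d = np.array([0.0, 0.0, 1.0]), -5.0

H = K @ (R - np.outer(t, n) / d) @ np.linalg.inv(K)
e2 = K @ t                       # epipole in image 2

rhos, expected, residuals = [], [], []
for Z in (3.0, 5.0, 8.0):        # in front of, on, and behind the plane
    X = np.array([0.4, -0.3, Z])
    x1 = K @ X / Z               # image 1 point, third coordinate 1
    x2 = K @ (R @ X + t)         # image 2 point, arbitrary scale
    rho, res = relative_parallax(H, e2, x1, x2)
    rhos.append(rho)
    residuals.append(res)
    expected.append((n @ X + d) / (d * Z))

print(rhos, expected)
```

A near-zero residual confirms that x', Hx and e' are collinear; the sign of ρ flips across the plane, and which side counts as positive depends on the sign convention chosen for d.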


Chapter 7: Decomposing H

Given a plane-induced homography H and the camera calibrations K and K', the rotation R, the translation t (up to scale), and the plane normal n can be recovered, up to a small discrete set of solutions.

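The key relationship: H = K'(R − t nᵀ/d) K⁻¹. Normalizing: K'⁻¹HK = λ(R − t nᵀ/d), where the unknown scale λ is fixed by requiring the middle singular value to equal 1. An SVD-based decomposition then yields up to four candidate solutions; requiring the observed points to lie in front of both cameras leaves at most two.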

The decomposition gives: (1) the camera rotation R, (2) the translation scaled by the plane distance, t/d (the absolute length of t cannot be recovered from images alone), and (3) the plane normal n. This is remarkably powerful: a single plane in two calibrated images reveals the relative pose of the cameras up to scale.
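Before any decomposition, the arbitrary scale of H has to be removed. The sketch below covers only that first step, not the full recovery of R, t/d and n. It relies on the fact that a matrix of the form R − t nᵀ/d has middle singular value 1, and it assumes both camera centres lie on the same side of the plane, which makes the determinant positive and so fixes the sign. The function name `euclidean_homography` and all numerical values are made up for illustration.

```python
import numpy as np

def euclidean_homography(H, K1, K2):
    """Return K2^-1 H K1 rescaled to equal R - t n^T / d.

    Assumes both camera centres lie on the same side of the plane, so that
    det(R - t n^T / d) > 0; this resolves the remaining sign ambiguity.
    """
    He = np.linalg.inv(K2) @ H @ K1
    He = He / np.linalg.svd(He, compute_uv=False)[1]   # middle singular value -> 1
    if np.linalg.det(He) < 0:
        He = -He
    return He

# Made-up ground truth.
K1 = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K2 = np.array([[750.0, 0.0, 300.0], [0.0, 750.0, 250.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(10.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.5, 0.1, 0.2])
n, d = np.array([0.1, -0.2, 1.0]), -5.0

G_true = R - np.outer(t, n) / d
H = -3.7 * K2 @ G_true @ np.linalg.inv(K1)   # homography known only up to scale

G = euclidean_homography(H, K1, K2)
print(np.allclose(G, G_true))
```

From the normalized matrix, the candidate (R, t/d, n) triples would then be extracted from its SVD and pruned with the positive-depth test described above.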

Chapter 8: Plane Detection

Homographies can be used to detect planes in the scene. If a subset of correspondences is well-explained by a homography (small reprojection error after fitting H), those correspondences likely arise from a planar surface.

RANSAC for plane detection: Sample 4 correspondences, fit H, count inliers. The inlier set defines a plane. Repeat with the remaining correspondences to find multiple planes. This is the basis of multi-plane scene understanding.

The distinction between H-consistent and F-consistent correspondences also provides a test for the planar degeneracy: if all correspondences fit a homography well, the scene may be planar and F estimation will be unreliable.
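The RANSAC loop described above can be sketched as follows. This is a minimal illustration rather than a production implementation: the function names, the 2-pixel threshold, the iteration count and the noise-free synthetic scene are all assumptions made here, and the homography is fitted with a basic normalized DLT.

```python
import numpy as np

def normalise(pts):
    """Shift to zero mean and scale to mean distance sqrt(2); return points and transform."""
    c = pts.mean(axis=0)
    s = np.sqrt(2.0) / np.mean(np.linalg.norm(pts - c, axis=1))
    T = np.array([[s, 0.0, -s * c[0]], [0.0, s, -s * c[1]], [0.0, 0.0, 1.0]])
    return (pts - c) * s, T

def fit_homography(x1, x2):
    """DLT estimate of H with x2 ~ H x1, from (N, 2) arrays with N >= 4."""
    p1, T1 = normalise(x1)
    p2, T2 = normalise(x2)
    rows = []
    for (x, y), (u, v) in zip(p1, p2):
        rows.append([-x, -y, -1.0, 0.0, 0.0, 0.0, u * x, u * y, u])
        rows.append([0.0, 0.0, 0.0, -x, -y, -1.0, v * x, v * y, v])
    Hn = np.linalg.svd(np.array(rows))[2][-1].reshape(3, 3)
    return np.linalg.inv(T2) @ Hn @ T1

def transfer_error(H, x1, x2):
    """Distance in image 2 between H x1 and x2, per correspondence."""
    p = np.column_stack([x1, np.ones(len(x1))]) @ H.T
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.linalg.norm(p[:, :2] / p[:, 2:3] - x2, axis=1)

def ransac_homography(x1, x2, thresh=2.0, iters=500, seed=0):
    """Keep the largest inlier set over random 4-point samples, then refit."""
    rng = np.random.default_rng(seed)
    best = np.zeros(len(x1), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(x1), size=4, replace=False)
        H = fit_homography(x1[idx], x2[idx])
        inliers = transfer_error(H, x1, x2) < thresh
        if inliers.sum() > best.sum():
            best = inliers
    return fit_homography(x1[best], x2[best]), best

def project(K, R, t, X):
    """Project (N, 3) points to (N, 2) pixel coordinates."""
    x = (X @ R.T + t) @ K.T
    return x[:, :2] / x[:, 2:3]

# Synthetic, noise-free scene: 40 points on the plane Z = 5, 20 points off it.
rng = np.random.default_rng(1)
K = np.array([[700.0, 0.0, 320.0], [0.0, 700.0, 240.0], [0.0, 0.0, 1.0]])
a = np.deg2rad(5.0)
R = np.array([[np.cos(a), 0.0, np.sin(a)],
              [0.0, 1.0, 0.0],
              [-np.sin(a), 0.0, np.cos(a)]])
t = np.array([0.5, 0.0, 0.0])
plane_pts = np.column_stack([rng.uniform(-2, 2, (40, 2)), np.full(40, 5.0)])
other_pts = np.column_stack([rng.uniform(-1, 1, (20, 2)),
                             rng.uniform(2.0, 4.0, 20)])
X = np.vstack([plane_pts, other_pts])
x1 = project(K, np.eye(3), np.zeros(3), X)
x2 = project(K, R, t, X)

H, inliers = ransac_homography(x1, x2)
print(inliers[:40].sum(), inliers[40:].sum())
```

To find further planes, the same routine would be run again on the correspondences outside `inliers`. With real, noisy matches the threshold and the number of iterations need tuning, and a sample that returns fewer than four inliers has to be handled before the final refit.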


Chapter 9: Connections

Link | Connection
Ch 2 → Ch 13 | Homographies of 2D (Ch 2) are the single-view case; plane-induced homographies add multi-view depth
Ch 10 → Ch 13 | H∞ identifies the plane at infinity π∞ for affine reconstruction
Ch 11 → Ch 13 | F can be computed from H + off-plane correspondences, avoiding planar degeneracy
Ch 13 → Ch 18 | Plane-induced homographies simplify N-view reconstruction (known planes reduce DOF)
Ch 13 → Ch 19 | H∞ = K'RK⁻¹ is a key equation in auto-calibration
"The parallax is a measure of departure from a plane, and is the geometric basis of stereo vision."
— Hartley & Zisserman, Chapter 13