Hartley & Zisserman, Chapter 19

Auto-Calibration

Recovering camera intrinsics without a calibration target. The absolute dual quadric, Kruppa equations, stratified self-calibration, and special cases like rotating cameras and planar motion.

Prerequisites: Chapter 8 (Single View Geometry) + Chapter 10 (Reconstruction).

Chapter 0: Why Auto-Calibration?

Traditional camera calibration requires a known calibration target (a checkerboard, a grid of control points). But what if you have only images of an unknown scene, taken by an unknown camera? Can you still recover the camera's intrinsic parameters?

Yes. Auto-calibration (or self-calibration) recovers the camera calibration matrix K from multiple views of an arbitrary scene, without any calibration target. The only requirement is some constraint on the cameras: typically that they share the same K, or that certain intrinsic parameters are known (for example zero skew or a known aspect ratio).

The fundamental principle: The image of the absolute conic ω = (KK^T)⁻¹ depends only on K, not on camera pose. If K is constant across views, then ω is the same in every image. This consistency constraint, combined with the reconstructed cameras, determines K.
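The link between ω and K can be made concrete with a short numerical sketch. It assumes NumPy and a made-up calibration matrix `K_true`; the helper names are illustrative only. Because ω = K⁻ᵀK⁻¹ and K is upper triangular, a Cholesky factorisation of ω returns K even when ω is known only up to a positive scale.

```python
import numpy as np

def iac_from_K(K):
    """Image of the absolute conic, omega = (K K^T)^-1."""
    return np.linalg.inv(K @ K.T)

def K_from_iac(omega):
    """Recover upper-triangular K (with K[2, 2] = 1) from omega.

    omega may be scaled by any positive factor.
    """
    # omega = K^-T K^-1. Cholesky gives omega = L L^T with L lower triangular,
    # so L equals K^-T up to scale and K = inv(L)^T up to scale.
    L = np.linalg.cholesky(omega)
    K = np.linalg.inv(L).T
    return K / K[2, 2]

# Hypothetical intrinsics in pixel units (an assumption for illustration).
K_true = np.array([[800.0, 0.5, 320.0],
                   [0.0, 820.0, 240.0],
                   [0.0, 0.0, 1.0]])

omega = iac_from_K(K_true)
K_rec = K_from_iac(3.7 * omega)   # the arbitrary scale 3.7 is removed again
```

Under these assumptions `K_rec` agrees with `K_true` to numerical precision, which is why every method in this chapter can aim at ω (or its dual ω* = KKᵀ) and read K off at the end.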

What makes auto-calibration possible without a calibration target?

(a) The image of the absolute conic ω depends only on K (not on pose), so if K is constant, ω must be the same in all views
(b) The fundamental matrix encodes K directly
(c) The scene must contain known 3D structures

Chapter 1: Algebraic Framework

Start with a projective reconstruction (P_i, {X_j}) from multiple views. The goal is to find the 4×4 upgrading homography H that transforms this into a metric reconstruction:

P_i^metric = P_i H = K_i[R_i | t_i],   X_j^metric = H⁻¹ X_j

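H has 15 DOF (a general 4×4 projective transformation, up to scale), but a metric reconstruction is only defined up to a similarity (7 DOF). Auto-calibration therefore has to fix 15 − 7 = 8 parameters: 3 for the plane at infinity and 5 for the absolute conic. If all cameras share the same unknown K, each view after the first contributes 5 constraints (its K must equal that of the first view), so m cameras give 5(m − 1) equations. Requiring 5(m − 1) ≥ 8 gives m ≥ 3, and from three views onwards the problem is (over)determined.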

The two approaches:
(1) Direct: Estimate the absolute dual quadric Q*_∞ in the projective frame, then extract H.
(2) Stratified: First upgrade to affine (find π_∞), then upgrade to metric (find Ω_∞).

If all cameras share the same unknown K, how many views are needed for auto-calibration?

(a) At least 3
(b) At least 2
(c) At least 10

Chapter 2: The Absolute Dual Quadric Q*_∞

The absolute dual quadric Q*_∞ is a 4×4 symmetric matrix that encodes both the plane at infinity and the absolute conic. In a metric frame:

Q*_∞ = [I 0 ; 0^T 0]

In an arbitrary projective frame, Q*_∞ = H [I 0 ; 0^T 0] H^T. The key property linking Q*_∞ to calibration is:

ω*_i = P_i Q*_∞ P_i^T = K_iK_i^T

So the projection of Q*_∞ through camera P_i gives the dual IAC ω*_i = K_iK_i^T.
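A small numerical check of this projection property is sketched below. It assumes NumPy, a made-up K, a random camera pose, and a hypothetical 4×4 matrix `H` standing in for the unknown projective distortion; none of these values come from the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_rotation(rng):
    """Random 3x3 rotation: QR of a Gaussian matrix, sign fixed so det = +1."""
    Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return Q if np.linalg.det(Q) > 0 else -Q

# Hypothetical intrinsics and projective distortion (assumptions).
K = np.array([[700.0, 0.0, 300.0],
              [0.0, 700.0, 200.0],
              [0.0, 0.0, 1.0]])
H = np.eye(4) + 0.3 * rng.standard_normal((4, 4))

I_tilde = np.diag([1.0, 1.0, 1.0, 0.0])   # Q*_inf in a metric frame
Q_inf = H @ I_tilde @ H.T                 # Q*_inf in the projective frame

def projected_dual_iac():
    """Project Q*_inf through one randomly posed camera in the projective frame."""
    R, t = random_rotation(rng), rng.standard_normal(3)
    P_metric = K @ np.hstack([R, t[:, None]])
    P_proj = P_metric @ np.linalg.inv(H)  # same camera, distorted frame
    return P_proj @ Q_inf @ P_proj.T

w_star = projected_dual_iac()
```

Whatever pose is drawn, `w_star` equals K Kᵀ here, because the distortion cancels and R Rᵀ = I removes the rotation. With real data the equality holds only up to scale and noise.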

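The linear method: Q*_∞ is a 4×4 symmetric matrix (10 entries, 9 DOF up to scale) with rank 3. Each camera constrains Q*_∞ via ω*_i = P_i Q*_∞ P_i^T, and any known property of K_i becomes an equation on the entries of ω*_i. With the principal point known (placed at the origin), zero skew and unit aspect ratio, each camera gives 4 linear equations: ω*_12 = ω*_13 = ω*_23 = 0 and ω*_11 = ω*_22. Three cameras give 12 equations, enough to determine the 9 DOF of Q*_∞; the rank-3 condition (det Q*_∞ = 0) is enforced afterwards. A constant but unknown K instead gives ω*_i ∝ ω*_j, which is quadratic in the unknowns.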
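The linear estimate can be sketched as follows, assuming NumPy, noise-free synthetic cameras, and the four per-view assumptions of zero skew, unit aspect ratio and principal point at the origin. Focal lengths are given in normalised units of order 1 to keep the system well conditioned; all function names and numbers are illustrative.

```python
import numpy as np

IDX = [(j, k) for j in range(4) for k in range(j, 4)]  # 10 free entries of Q

def coeffs(P, a, b):
    """Coefficients of the 10 unknowns of Q in the scalar (P Q P^T)[a, b]."""
    return np.array([P[a, j] * P[b, k] + (P[a, k] * P[b, j] if j != k else 0.0)
                     for j, k in IDX])

def estimate_Q(Ps):
    """Linear estimate of Q*_inf (up to scale) from projective cameras."""
    A = []
    for P in Ps:
        A.append(coeffs(P, 0, 1))                    # w*_12 = 0
        A.append(coeffs(P, 0, 2))                    # w*_13 = 0
        A.append(coeffs(P, 1, 2))                    # w*_23 = 0
        A.append(coeffs(P, 0, 0) - coeffs(P, 1, 1))  # w*_11 = w*_22
    q = np.linalg.svd(np.array(A))[2][-1]            # null vector of A
    Q = np.zeros((4, 4))
    for value, (j, k) in zip(q, IDX):
        Q[j, k] = Q[k, j] = value
    return Q

def focal_from_Q(P, Q):
    """Focal length of one camera from its projected dual IAC."""
    w = P @ Q @ P.T
    return np.sqrt(w[0, 0] / w[2, 2])

# Synthetic test data (all values are assumptions).
rng = np.random.default_rng(1)

def random_rotation():
    R, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return R if np.linalg.det(R) > 0 else -R

H = np.eye(4) + 0.3 * rng.standard_normal((4, 4))  # hypothetical distortion
focals = [1.0, 1.4, 0.8, 1.2]                      # one focal length per view
Ps = []
for f in focals:
    K = np.diag([f, f, 1.0])
    Rt = np.hstack([random_rotation(), rng.standard_normal((3, 1))])
    Ps.append(K @ Rt @ np.linalg.inv(H))

Q_est = estimate_Q(Ps)
f_est = [focal_from_Q(P, Q_est) for P in Ps]
```

On this noise-free data `f_est` reproduces `focals`. The sketch does not enforce rank 3 on `Q_est`; with noisy cameras one would typically zero the smallest singular value and then refine non-linearly.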

What is the relationship between Q*_∞ and the camera calibration K_i?

(a) P_i Q*_∞ P_i^T = K_iK_i^T
(b) Q*_∞ = K_i
(c) Q*_∞ is independent of K

Chapter 3: The Kruppa Equations

An older approach to auto-calibration uses the Kruppa equations, which relate the dual IAC ω* directly to the fundamental matrix F between two views:

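[e']_× ω'* [e']_× = F ω* F^T (equality up to scale; e' is the epipole in the second view, and ω'* = ω* when K is constant)

Each pair of views provides 2 independent Kruppa equations. Both sides are symmetric 3×3 matrices with e' in their null space, which leaves 3 free parameters on each side; equality only up to scale removes one more, so 2 independent constraints on ω* remain.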

Kruppa vs. Q*_∞: The Kruppa approach works directly from fundamental matrices (no projective reconstruction needed). But it is harder to enforce rank constraints and more sensitive to degeneracies. The Q*_∞ approach is generally preferred for modern implementations because it provides a cleaner algebraic framework.

For a camera with constant K and zero skew, the Kruppa equations from 3 pairs of views give enough constraints to determine ω* and hence K.
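The Kruppa relation can be checked numerically on synthetic data. The sketch below assumes NumPy, cameras P = K[I | 0] and P' = K[R | t] with made-up values, and the standard formula F = K⁻ᵀ[t]_×RK⁻¹ for that configuration; the helper names are illustrative.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rotation(axis, angle):
    """Rodrigues formula for a rotation about a given axis."""
    a = np.asarray(axis, dtype=float)
    A = skew(a / np.linalg.norm(a))
    return np.eye(3) + np.sin(angle) * A + (1.0 - np.cos(angle)) * (A @ A)

def proportional(A, B, tol=1e-9):
    """True if A and B agree up to a non-zero scale factor."""
    a, b = A / np.linalg.norm(A), B / np.linalg.norm(B)
    return min(np.linalg.norm(a - b), np.linalg.norm(a + b)) < tol

# Hypothetical constant intrinsics and relative motion (assumptions).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = rotation([0.2, 1.0, -0.3], 0.25)
t = np.array([0.3, -0.1, 1.0])

K_inv = np.linalg.inv(K)
F = K_inv.T @ skew(t) @ R @ K_inv   # fundamental matrix of the pair
e2 = K @ t                          # epipole in the second view
w_star = K @ K.T                    # dual IAC, same in both views

lhs = skew(e2) @ w_star @ skew(e2)
rhs = F @ w_star @ F.T
kruppa_holds = proportional(lhs, rhs)
```

In a real auto-calibration problem `w_star` is the unknown and F is measured, so the same relation is read as two quadratic equations in the entries of ω* per view pair.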

How many independent Kruppa equations does each pair of views provide?

(a) 2
(b) 5
(c) 1

Chapter 4: Stratified Self-Calibration

The stratified approach performs auto-calibration in two steps:

Step	What	How
1. Projective → Affine	Find π_∞	Modulus constraint (constant K): the infinite homography between two views must have eigenvalues of equal modulus, which restricts the coordinates of π_∞.
2. Affine → Metric	Find Ω_∞	With π_∞ known, the infinite homographies H_∞ are available and give linear equations H_∞^T ω H_∞ = ω on the IAC; K then follows from ω by Cholesky factorisation.

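The modulus constraint: For a camera with constant K, the infinite homography H_∞ = K R K⁻¹ satisfies H_∞^T ω = ω H_∞⁻¹ (since ω = (KK^T)⁻¹ is invariant). H_∞ is therefore conjugate to a rotation matrix, so its eigenvalues are, up to a common scale, {1, e^{iθ}, e^{−iθ}} and all have the same modulus. In a projective reconstruction with P_1 = [I | 0], P_i = [A_i | a_i] and π_∞ = (p^T, 1)^T, the infinite homography is H_∞ = A_i − a_i p^T, so the equal-modulus condition becomes a quartic polynomial equation in the three unknowns p. This is the "modulus constraint."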
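One consequence that is easy to verify numerically: because H_∞ is similar to a rotation, its eigenvalues share a common modulus, whatever overall scale the homography carries. The sketch assumes NumPy and made-up values for K, the rotation and the scale factor.

```python
import numpy as np

def rotation(axis, angle):
    """Rodrigues formula for a rotation about a given axis."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    A = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * A + (1.0 - np.cos(angle)) * (A @ A)

# Hypothetical intrinsics and rotation (assumptions for illustration).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
R = rotation([0.3, 1.0, 0.2], 0.4)

# Infinite homography, deliberately given an arbitrary scale of 2.5.
H_inf = 2.5 * K @ R @ np.linalg.inv(K)

moduli = np.abs(np.linalg.eigvals(H_inf))   # all three should equal 2.5
```

A candidate plane at infinity whose induced homography fails this test can be rejected. The constraint is necessary but not sufficient, so it is normally combined across several view pairs.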

What is the stratified approach to auto-calibration?

(a) First upgrade to affine (find π_∞), then upgrade to metric (find Ω_∞)
(b) Directly estimate K from F
(c) Use a calibration grid

Chapter 5: Rotating Cameras

When the camera rotates about its centre (no translation), there is no baseline, F is undefined, and standard reconstruction fails. But auto-calibration becomes easier!

Two views related by pure rotation are connected by a homography H = K R K⁻¹. From H and the constraint that K is the same:

H^T ω H = ω where ω = (KK^T)⁻¹

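This is one of the most practical auto-calibration methods. A camera rotating on a tripod (as in panoramic stitching) gives multiple homographies. With each H scaled so that det H = 1, the equation H^TωH = ω is linear in the 6 entries of ω and yields 4 independent constraints; a one-parameter family of solutions remains because a single rotation also leaves its own axis direction fixed. Two rotations about different axes (3 views) therefore determine ω completely, and assuming zero skew and square pixels makes the estimate more stable. This is the basis of automatic calibration in panoramic stitching.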
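A minimal sketch of the rotating-camera method, assuming NumPy and two noise-free homographies generated from a made-up K. Intrinsics are in normalised units of order 1; with raw pixel coordinates it would be advisable to condition the system first. Function names are illustrative.

```python
import numpy as np

IDX6 = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]  # entries of symmetric omega

def sym3(w):
    """Symmetric 3x3 matrix from its 6 independent entries."""
    S = np.zeros((3, 3))
    for value, (j, k) in zip(w, IDX6):
        S[j, k] = S[k, j] = value
    return S

def rotation_constraints(H):
    """6x6 matrix A with A @ w = 0 equivalent to H^T omega H - omega = 0."""
    H = H / np.cbrt(np.linalg.det(H))          # normalise so det H = 1
    cols = []
    for c in range(6):
        B = sym3(np.eye(6)[c])                 # basis element of omega
        M = H.T @ B @ H - B
        cols.append([M[j, k] for j, k in IDX6])
    return np.array(cols).T

def calibrate_from_rotations(Hs):
    A = np.vstack([rotation_constraints(H) for H in Hs])
    omega = sym3(np.linalg.svd(A)[2][-1])      # null vector of the stacked system
    if omega[2, 2] < 0:                        # fix the sign so omega is positive definite
        omega = -omega
    L = np.linalg.cholesky(omega)              # omega = K^-T K^-1, so L ~ K^-T
    K = np.linalg.inv(L).T
    return K / K[2, 2]

def rotation(axis, angle):
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    A = np.array([[0.0, -a[2], a[1]], [a[2], 0.0, -a[0]], [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * A + (1.0 - np.cos(angle)) * (A @ A)

# Hypothetical intrinsics and two rotations about different axes.
K_true = np.array([[1.2, 0.0, 0.1], [0.0, 1.3, -0.05], [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K_true)
Hs = [2.0 * K_true @ rotation([0, 1, 0], 0.3) @ K_inv,    # arbitrary scale 2.0
      -0.5 * K_true @ rotation([1, 0, 0], 0.2) @ K_inv]   # arbitrary scale -0.5

K_est = calibrate_from_rotations(Hs)
```

The two homographies carry different arbitrary scales on purpose: the determinant normalisation removes them before the linear solve. Passing a single homography would leave a two-dimensional null space, so the result would not be unique.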

For a camera rotating about its centre, two views are related by:

(a) A homography H = KRK⁻¹ (no epipolar geometry, no F)
(b) A fundamental matrix F
(c) An affine transformation

Chapter 6: Auto-Calibration from Planes

Zhang's method: image a planar pattern (not a full 3D calibration object) from multiple viewpoints. The homography between the model plane and each image provides constraints on K.

Each view of a known metric plane provides 2 constraints on ω (from the circular points of the plane). With ω having 5 DOF for a general K, three views of the plane suffice. With zero skew assumed, two views suffice.
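The two constraints per view are h_1^T ω h_2 = 0 and h_1^T ω h_1 = h_2^T ω h_2, where h_1 and h_2 are the first two columns of the plane-to-image homography. The sketch below assumes NumPy, three noise-free synthetic homographies H = K[r_1 r_2 t], and intrinsics in normalised units; it illustrates only the closed-form step, not a full calibration pipeline with distortion and refinement.

```python
import numpy as np

IDX6 = [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]

def v(a, b):
    """Coefficients of the 6 entries of omega in the scalar a^T omega b."""
    return np.array([a[j] * b[k] + (a[k] * b[j] if j != k else 0.0)
                     for j, k in IDX6])

def omega_from_plane_homographies(Hs):
    A = []
    for H in Hs:
        h1, h2 = H[:, 0], H[:, 1]
        A.append(v(h1, h2))               # h1^T omega h2 = 0
        A.append(v(h1, h1) - v(h2, h2))   # h1^T omega h1 = h2^T omega h2
    w = np.linalg.svd(np.array(A))[2][-1]
    omega = np.zeros((3, 3))
    for value, (j, k) in zip(w, IDX6):
        omega[j, k] = omega[k, j] = value
    return omega if omega[2, 2] > 0 else -omega

def K_from_omega(omega):
    L = np.linalg.cholesky(omega)         # omega = K^-T K^-1
    K = np.linalg.inv(L).T
    return K / K[2, 2]

# Synthetic views of the plane Z = 0 (all values are assumptions).
rng = np.random.default_rng(2)

def random_rotation():
    R, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return R if np.linalg.det(R) > 0 else -R

K_true = np.array([[1.2, 0.01, 0.1], [0.0, 1.3, -0.05], [0.0, 0.0, 1.0]])
Hs = []
for _ in range(3):
    R, t = random_rotation(), rng.standard_normal(3)
    scale = rng.uniform(0.5, 2.0)         # homographies are only known up to scale
    Hs.append(scale * K_true @ np.column_stack([R[:, 0], R[:, 1], t]))

K_est = K_from_omega(omega_from_plane_homographies(Hs))
```

Three views give 6 equations for the 6 homogeneous entries of ω, matching the count above. With measured homographies one would use more views and follow the linear solution with a non-linear refinement.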

This is one of the most widely used calibration methods in practice. It requires only a printed checkerboard pattern, not a precision-machined 3D object. OpenCV's calibrateCamera function is based on this approach.

Zhang's calibration method requires images of what?

(a) A planar pattern (like a checkerboard) from multiple viewpoints
(b) A 3D calibration cube with known dimensions
(c) An arbitrary unknown scene

Chapter 7: Planar Motion

When a camera moves in a plane (translation in one plane + rotation about an axis perpendicular to it), additional constraints on F and hence on K arise. The symmetric part of F has rank 2, giving an extra constraint per view pair.
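The rank property can be checked on synthetic motion. The sketch assumes NumPy, a constant made-up K in normalised units, a rotation about the y-axis, and F = K⁻ᵀ[t]_×RK⁻¹; planar motion then means that t has no y-component. Names and numbers are illustrative.

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def rot_y(theta):
    """Rotation about the y-axis (the axis perpendicular to the motion plane)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])

def fundamental(K, R, t):
    K_inv = np.linalg.inv(K)
    return K_inv.T @ skew(t) @ R @ K_inv

def symmetric_rank_ratio(F):
    """Smallest over largest singular value of F_s = (F + F^T) / 2."""
    s = np.linalg.svd(0.5 * (F + F.T), compute_uv=False)
    return s[-1] / s[0]

# Hypothetical intrinsics in normalised units (an assumption).
K = np.array([[1.5, 0.0, 0.1], [0.0, 1.5, -0.05], [0.0, 0.0, 1.0]])
R = rot_y(0.3)

# Planar motion: translation stays in the x-z plane.
planar = symmetric_rank_ratio(fundamental(K, R, np.array([1.0, 0.0, 0.5])))
# General motion: same rotation, translation leaves the plane.
general = symmetric_rank_ratio(fundamental(K, R, np.array([1.0, 0.7, 0.5])))
```

Here `planar` is zero to rounding error, so F_s has rank 2, while `general` is clearly non-zero. With an estimated F the ratio is never exactly zero, and det F_s = 0 is better imposed as a constraint during estimation than tested afterwards.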

Application: ground vehicle cameras. A car-mounted camera undergoes planar motion (approximately). This provides extra constraints that improve auto-calibration robustness. Similarly, turntable motion (single-axis rotation) provides very strong constraints: K can be recovered from just 3 views of an unknown scene.

Turntable motion (rotation about a fixed axis) is a special case where the rotation angle between views can be computed, and K can be determined from the constraint that the rotation axis maps to a known vanishing point.

What special property of the fundamental matrix arises from planar motion?

(a) The symmetric part of F has rank 2 (det F_s = 0)
(b) F is symmetric
(c) F is the identity

Chapter 8: Practical Issues

Issue	Guidance
Degeneracies	Pure rotation or planar scene causes F degeneracy; use homography-based methods instead
Varying K	Allow focal length to vary (zoom); keep skew=0 and aspect ratio=1 as constants
Critical motions	Some camera motions do not provide enough constraints (e.g., pure translation with constant K gives only an affine reconstruction)
Initialization	Auto-calibration is sensitive to initialization. Use the linear Q*_∞ or Kruppa solution as starting point, then refine with bundle adjustment
The practical approach	Assume zero skew, square pixels, principal point at image centre. This leaves only focal length unknown per camera. Very robust with 3+ views.

The most robust practical setup: Assume zero skew, square pixels, principal point = image centre. Then only the focal length f is unknown per camera. Each view gives 4 linear constraints on Q*_∞ (one per assumed intrinsic), so 3 views give 12 equations for its 9 DOF, and more views give a robust over-determined system. Many modern SfM pipelines make the same assumptions.
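Under these assumptions, reading f off a dual IAC ω* = KKᵀ is a one-line computation once the image origin is moved to the assumed principal point. The sketch assumes NumPy; the function name, image size and focal length are made up for illustration.

```python
import numpy as np

def focal_from_dual_iac(w_star, width, height):
    """Focal length from omega* = K K^T (any positive scale).

    Assumes zero skew, square pixels and principal point at the image centre.
    """
    cx, cy = width / 2.0, height / 2.0
    # Shift the image origin to the assumed principal point.
    T = np.array([[1.0, 0.0, -cx],
                  [0.0, 1.0, -cy],
                  [0.0, 0.0, 1.0]])
    w = T @ w_star @ T.T          # ideally proportional to diag(f^2, f^2, 1)
    return np.sqrt(0.5 * (w[0, 0] + w[1, 1]) / w[2, 2])

# Hypothetical camera: 640x480 image, f = 950 pixels.
K = np.array([[950.0, 0.0, 320.0],
              [0.0, 950.0, 240.0],
              [0.0, 0.0, 1.0]])
f = focal_from_dual_iac(3.0 * K @ K.T, 640, 480)
```

Averaging the two diagonal entries is a design choice that tolerates small violations of the square-pixel assumption; large off-diagonal entries in `w` would signal that the assumptions do not hold for the camera at hand.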

In the most common practical auto-calibration setup, which parameter is left unknown per camera?

(a) Only the focal length f (with zero skew, square pixels, and principal point at image centre assumed)
(b) All 5 intrinsic parameters
(c) The skew parameter s

Chapter 9: Connections

Link	Connection
Ch 8 → Ch 19	The IAC ω from single-view geometry is the same ω used in auto-calibration constraints
Ch 10 → Ch 19	Stratified reconstruction upgrades projective → affine → metric; auto-calibration provides the metric upgrade
Ch 13 → Ch 19	H_∞ = KRK⁻¹ and rotating-camera homographies are key tools
Ch 18 → Ch 19	Auto-calibration constraints integrate into bundle adjustment for joint optimization

"Self-calibration: from the images alone, without the use of any calibration object, it is possible to determine the internal parameters of a camera."

— Hartley & Zisserman, Chapter 19

What geometric object unifies all auto-calibration approaches?

(a) The image of the absolute conic ω = (KK^T)⁻¹, which depends only on K and is the same in all views when K is constant
(b) The fundamental matrix F
(c) The trifocal tensor T
