Hartley & Zisserman, Chapter 15

The Trifocal Tensor

The three-view analogue of the fundamental matrix. Tensor notation, point-line incidence, point transfer, and the relationship to fundamental matrices and camera matrices.

Prerequisites: Chapter 9 (Epipolar Geometry) + Chapter 11 (Computing F).

Chapter 0: Why Three Views?

The fundamental matrix captures the geometry of two views. But many reconstruction problems involve three or more views. Is there a geometric object that encodes the relationship between three views, just as F encodes two views?

Yes: the trifocal tensor T. It is a 3 × 3 × 3 array (27 entries) that encodes all the geometric relationships between three views. A point in one image and lines in the other two satisfy a trilinear relation through T.

What T gives you over F: A point correspondence across three views provides 4 linearly independent equations on the entries of T (vs. 1 equation on F per two-view correspondence). Each three-view match therefore constrains the estimate more strongly than a two-view match constrains F. T also enables point transfer: given a point matched in two images, predict its location in the third.

How many independent constraints does a three-view point correspondence provide on the trifocal tensor?

4
1
9

Chapter 1: The Geometric Basis

Consider three cameras with centres C, C', C'' and a 3D point X. The point X, together with the three camera centres, defines three epipolar planes (one for each pair of cameras). The trifocal tensor arises from the constraint that these three planes share the point X.

More concretely: a line l' in the second image back-projects to a plane π'. A line l'' in the third image back-projects to a plane π''. These two planes intersect in a 3D line L. The image of L in the first camera is a line l. The trifocal tensor computes l from l' and l'':

l_p = l'_q l''_r T^qr_p

The trifocal tensor maps pairs of lines to a line. Two lines in two different images determine a line in the third. This is the fundamental geometric operation, from which all other relations (point-point-point, point-point-line, etc.) are derived.
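The line-transfer formula can be checked numerically. The sketch below is an illustration rather than the chapter's own code. It assumes the first camera is in the canonical form [I | 0], for which the tensor slices are T_i = a_i b_4^T − a_4 b_i^T (a_i, b_i being columns of the second and third camera matrices), and it stores T^qr_i as `T[i, q, r]`. The function names and the random scene are made up for the example.

```python
import numpy as np

def trifocal_canonical(P2, P3):
    """T[i, q, r] = T^qr_i, assuming the first camera is [I | 0]."""
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    # T_i = a_i b4^T - a4 b_i^T, with a_i, b_i the i-th columns of A, B
    return np.einsum('qi,r->iqr', A, b4) - np.einsum('q,ri->iqr', a4, B)

def transfer_line(T, l2, l3):
    """l_p = l'_q l''_r T^qr_p"""
    return np.einsum('q,r,iqr->i', l2, l3, T)

rng = np.random.default_rng(0)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = rng.standard_normal((3, 4))
P3 = rng.standard_normal((3, 4))
T = trifocal_canonical(P2, P3)

# A 3D line through two points; each image line joins the two projections.
X1, X2 = rng.standard_normal(4), rng.standard_normal(4)
l1 = np.cross(P1 @ X1, P1 @ X2)
l2 = np.cross(P2 @ X1, P2 @ X2)
l3 = np.cross(P3 @ X1, P3 @ X2)

l1_pred = transfer_line(T, l2, l3)
# Homogeneous lines agree if their cross product vanishes.
residual = np.linalg.norm(
    np.cross(l1_pred / np.linalg.norm(l1_pred), l1 / np.linalg.norm(l1))
)
print(residual)
```

The residual should be at the level of floating-point round-off, since the transferred line and the directly projected line differ only by scale.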

What does the trifocal tensor compute from two lines in two images?

The corresponding line in the third image
The epipole
The fundamental matrix

Chapter 2: Defining the Tensor T

For three camera matrices A, B, C (each 3×4), the trifocal tensor entry T^qr_i is:

T^qr_i = (−1)ⁱ⁺¹ det [a_l, a_m, b_q, c_r]

where a_l, a_m (l < m) are the two rows of A that remain after deleting row i, and b_q, c_r are single rows of B and C, so the determinant is that of a 4×4 matrix.
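The determinant definition translates almost literally into code. The following is a minimal sketch under the assumption that T^qr_i is stored as `T[i, q, r]` with 0-based indices (so the sign (−1)^(i+1) becomes `(-1) ** i`); the function name is hypothetical. It also illustrates why a projective change of the world frame does not matter: multiplying all three cameras by the same 4×4 matrix H only rescales T by det(H).

```python
import numpy as np

def trifocal_from_determinants(A, B, C):
    """T[i, q, r] = (-1)^(i+1) det[rows of A except i; row q of B; row r of C]."""
    T = np.zeros((3, 3, 3))
    for i in range(3):
        rows_a = np.delete(A, i, axis=0)   # the two remaining rows of A
        sign = (-1) ** i                   # 0-based version of (-1)^(i+1)
        for q in range(3):
            for r in range(3):
                M = np.vstack([rows_a, B[q], C[r]])   # 4x4 matrix
                T[i, q, r] = sign * np.linalg.det(M)
    return T

rng = np.random.default_rng(1)
A = np.hstack([np.eye(3), np.zeros((3, 1))])
B = rng.standard_normal((3, 4))
C = rng.standard_normal((3, 4))
T = trifocal_from_determinants(A, B, C)

# Same cameras expressed in a different projective frame.
H = rng.standard_normal((4, 4))
T_h = trifocal_from_determinants(A @ H, B @ H, C @ H)
scale_matches = np.allclose(T_h, np.linalg.det(H) * T)
print(scale_matches)
```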

DOF: The trifocal tensor has 27 entries, defined up to scale, so 26 DOF as a homogeneous object. But it arises from 3 camera matrices (33 DOF) minus a 15-DOF projective ambiguity = 18 DOF. So T must satisfy 26 − 18 = 8 internal constraints.

Correspondence type	# independent equations
Three points	4
Two points, one line	2
One point, two lines	1
Three lines	2
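The counts in the table can be reproduced by writing each incidence relation as a linear system in the 27 entries of T and taking its rank. The sketch below assumes the standard forms of the four relations (for example [x']_× (Σ xⁱ T_i) [x'']_× = 0 for three points) and the storage convention `T[i, q, r]`. The image points and lines are arbitrary illustrative values, since the rank depends only on the structure of the equations.

```python
import numpy as np

def skew(v):
    """Matrix form of the cross product: skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Arbitrary homogeneous points and lines in the three images.
x1, x2, x3 = np.array([0.3, -0.2, 1.0]), np.array([1.1, 0.4, 1.0]), np.array([-0.7, 0.9, 1.0])
l1, l2, l3 = np.array([1.0, 2.0, -0.5]), np.array([0.2, -1.0, 0.3]), np.array([-1.5, 0.4, 1.0])

# Each system has one row per equation and one column per entry T[i, q, r].
systems = {
    'three points': np.einsum('i,sq,rt->stiqr', x1, skew(x2), skew(x3)).reshape(9, 27),
    'two points, one line': np.einsum('i,sq,r->siqr', x1, skew(x2), l3).reshape(3, 27),
    'one point, two lines': np.einsum('i,q,r->iqr', x1, l2, l3).reshape(1, 27),
    'three lines': np.einsum('wi,q,r->wiqr', skew(l1), l2, l3).reshape(3, 27),
}
ranks = {name: int(np.linalg.matrix_rank(M)) for name, M in systems.items()}
print(ranks)
# {'three points': 4, 'two points, one line': 2, 'one point, two lines': 1, 'three lines': 2}
```

The ranks follow from the fact that a 3×3 skew-symmetric matrix has rank 2: each point in views 2 or 3 contributes a factor of 2, each line a factor of 1.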

How many degrees of freedom does the trifocal tensor have?

18
27
7

Chapter 3: Point Transfer

Given a point x in image 1 and its correspondence x' in image 2, the trifocal tensor predicts the point x'' in image 3:

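x''^r = xⁱ l'_q T^qr_i

where l' is any line through x' other than the epipolar line of x (a common choice is the line through x' perpendicular to that epipolar line). This is point transfer: no triangulation is needed. The tensor maps (x, x') directly to x''.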

Why is this useful? In view synthesis, you want to predict what a third camera would see. In correspondence search, if you have matches in two views, T predicts the location in the third — narrowing the search from a line (epipolar) to a point.

The transfer is exact: if x, x' correspond to the same 3D point, the predicted x'' is the exact projection. This is more powerful than the epipolar constraint (which only constrains x'' to a line).
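As a rough illustration of point transfer, the sketch below uses the standard form x''^r = xⁱ l'_q T^qr_i, where l' is a line through x'. It assumes the first camera is [I | 0], stores T^qr_i as `T[i, q, r]`, and takes the epipole e' in image 2 as an input so that l' can be chosen perpendicular to the epipolar line. The function names and the synthetic scene are hypothetical.

```python
import numpy as np

def trifocal_canonical(P2, P3):
    """T[i, q, r] = T^qr_i, assuming the first camera is [I | 0]."""
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.einsum('qi,r->iqr', A, b4) - np.einsum('q,ri->iqr', a4, B)

def transfer_point(T, x1, x2, e2):
    """Predict x'' from x, x' and the epipole e2 of camera 1 in image 2."""
    le = np.cross(e2, x2)                 # epipolar line through x'
    u, v = x2[:2] / x2[2]
    # Line through x' perpendicular to the epipolar line.
    l2 = np.array([le[1], -le[0], -u * le[1] + v * le[0]])
    return np.einsum('i,q,iqr->r', x1, l2, T)

rng = np.random.default_rng(2)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = rng.standard_normal((3, 4))
P3 = rng.standard_normal((3, 4))
T = trifocal_canonical(P2, P3)

X = rng.standard_normal(4)                # a 3D point (homogeneous)
x1, x2, x3 = P1 @ X, P2 @ X, P3 @ X
e2 = P2[:, 3]                             # image of the first camera centre

x3_pred = transfer_point(T, x1, x2, e2)
residual = np.linalg.norm(
    np.cross(x3_pred / np.linalg.norm(x3_pred), x3 / np.linalg.norm(x3))
)
print(residual)
```

Choosing l' perpendicular to the epipolar line keeps the transfer away from its degenerate case, in which l' coincides with the epipolar line and the prediction collapses to the zero vector.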

Point transfer using the trifocal tensor predicts what from correspondences in two views?

The exact location of the corresponding point in the third view
Only the epipolar line in the third view
The 3D point position

Chapter 4: Line Incidence

The most fundamental relation involving T is the point-line-line incidence: if a 3D point projects to x in image 1 and to points lying on lines l' and l'' in images 2 and 3, then

xⁱ l'_q l''_r T^qr_i = 0

All other relations (three points, two points + line, three lines) are derivable from this basic one by substituting points for lines.
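The incidence relation, and the three-point relation obtained from it, can be verified on synthetic data. This sketch assumes the first camera is [I | 0] and stores T^qr_i as `T[i, q, r]`; the three-point relation is written in its matrix form [x']_× (Σ xⁱ T_i) [x'']_× = 0. All names and the random scene are illustrative.

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def trifocal_canonical(P2, P3):
    """T[i, q, r] = T^qr_i, assuming the first camera is [I | 0]."""
    A, a4 = P2[:, :3], P2[:, 3]
    B, b4 = P3[:, :3], P3[:, 3]
    return np.einsum('qi,r->iqr', A, b4) - np.einsum('q,ri->iqr', a4, B)

rng = np.random.default_rng(3)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = rng.standard_normal((3, 4))
P3 = rng.standard_normal((3, 4))
T = trifocal_canonical(P2, P3)

X = rng.standard_normal(4)
x1, x2, x3 = P1 @ X, P2 @ X, P3 @ X

# Arbitrary lines through x' and x'': join each point with another image point.
l2 = np.cross(x2, rng.standard_normal(3))
l3 = np.cross(x3, rng.standard_normal(3))

# Point-line-line: x^i l'_q l''_r T^qr_i = 0 (a single scalar equation).
pll = np.einsum('i,q,r,iqr->', x1, l2, l3, T)

# Point-point-point: replace each line by the cross-product matrix of the point.
ppp = skew(x2) @ np.einsum('i,iqr->qr', x1, T) @ skew(x3)

print(abs(pll), np.abs(ppp).max())
```

Both printed values should be at round-off level. The rows of [x']_× are lines through x', which is why the three-point relation is just the point-line-line relation applied to several line pairs at once.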

Line correspondences across three views DO constrain T (unlike the two-view case where line correspondences give no constraint on F). This is because three planes in 3-space do not generically intersect in a line — the coincidence constraint is meaningful.

Unlike in two views, do line correspondences across three views constrain the multi-view geometry?

Yes: three lines provide 2 independent constraints on T
No: lines are still unconstrained
Only in special cases

Chapter 5: Tensor Notation

The trifocal tensor uses index notation from tensor algebra. Key conventions:

Symbol	Type	Meaning
xⁱ	Contravariant	Point (column vector)
l_i	Covariant	Line (row vector)
T^qr_i	Mixed	Tensor: one covariant, two contravariant indices
ε_ijk	Levi-Civita	Alternating tensor (cross product)

Why tensor notation? It makes the symmetries and transformations explicit. When camera 1 contributes two rows (index i with omission) and cameras 2 and 3 each contribute one row (indices q and r), the asymmetry in T^qr_i reflects this. There are actually three different trifocal tensors, depending on which camera is "distinguished."

How many distinct trifocal tensors can be formed from three camera matrices?

3 (one for each choice of the distinguished camera that contributes two rows)
1
9

Chapter 6: Fundamental Matrices from T

The trifocal tensor contains the fundamental matrices for all three pairs of views. They can be extracted as:

Pair	Extraction
F₂₁ (views 1,2)	[e']_× [T₁, T₂, T₃] e''
F₃₁ (views 1,3)	[e'']_× [T₁^T, T₂^T, T₃^T] e'
F₃₂ (views 2,3)	Computed from the camera matrices P', P'' recovered from T

where T_i are the 3×3 "slices" of the tensor, and e', e'' are the epipoles in views 2 and 3 (the images of the first camera centre).
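One way to carry out the extraction is sketched below. It follows the standard recipe: each slice T_i has rank 2, e' is orthogonal to the left null vectors of the slices, e'' is orthogonal to the right null vectors, and then F₂₁ = [e']_× [T₁, T₂, T₃] e'' and F₃₁ = [e'']_× [T₁^T, T₂^T, T₃^T] e'. The storage convention `T[i, q, r]`, the helper names, and the canonical first camera used to build a test tensor are assumptions of the example.

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def null_right(M):
    """Unit vector v minimising |M v| (last right singular vector)."""
    return np.linalg.svd(M)[2][-1]

def null_left(M):
    """Unit vector u minimising |u^T M| (last left singular vector)."""
    return np.linalg.svd(M)[0][:, -1]

def epipoles_from_T(T):
    U = np.array([null_left(T[i]) for i in range(3)])    # rows u_i
    V = np.array([null_right(T[i]) for i in range(3)])   # rows v_i
    return null_right(U), null_right(V)                  # e', e''

def fundamental_from_T(T):
    e2, e3 = epipoles_from_T(T)
    F21 = skew(e2) @ np.column_stack([T[i] @ e3 for i in range(3)])
    F31 = skew(e3) @ np.column_stack([T[i].T @ e2 for i in range(3)])
    return F21, F31

# Synthetic tensor from cameras [I | 0], P2, P3 (slices T_i = a_i b4^T - a4 b_i^T).
rng = np.random.default_rng(4)
P2 = rng.standard_normal((3, 4))
P3 = rng.standard_normal((3, 4))
T = (np.einsum('qi,r->iqr', P2[:, :3], P3[:, 3])
     - np.einsum('q,ri->iqr', P2[:, 3], P3[:, :3]))

F21, F31 = fundamental_from_T(T)

# Check the epipolar constraints x'^T F21 x = 0 and x''^T F31 x = 0.
X = rng.standard_normal(4)
x1, x2, x3 = X[:3], P2 @ X, P3 @ X
err21 = abs(x2 @ F21 @ x1) / (np.linalg.norm(F21) * np.linalg.norm(x1) * np.linalg.norm(x2))
err31 = abs(x3 @ F31 @ x1) / (np.linalg.norm(F31) * np.linalg.norm(x1) * np.linalg.norm(x3))
print(err21, err31)
```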

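T versus three F's. The trifocal tensor has 18 DOF. Three pairwise fundamental matrices have 3 × 7 = 21 parameters but must satisfy 3 compatibility constraints, leaving 18 independent DOF, exactly matching T. For cameras in general position, T and the three F's therefore carry the same information. The difference is that T enforces the consistency by construction, and it remains usable when the three camera centres are collinear, a configuration in which the pairwise F's no longer determine the three-view geometry.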

Can the fundamental matrices for all camera pairs be extracted from the trifocal tensor?

Yes: T encodes all three pairwise fundamental matrices
No: T only relates three views, not pairs
Only two of the three

Chapter 7: Computing T

The trifocal tensor can be computed from point and/or line correspondences across three views using methods analogous to the 8-point algorithm for F.

Method	Min correspondences	Notes
Linear (normalized)	7 points (or 13 lines)	Analogous to 8-point algorithm; normalize first
Algebraic minimization	7+	Enforce internal constraints
Geometric (Gold Standard)	7+	Minimize reprojection error via LM
RANSAC	6 points per sample	Minimal six-point solver inside the sampling loop; for outlier-contaminated data

Normalization and constraint enforcement matter just as much for T as for F. The linear algorithm estimates 26 DOF, but T has only 18 — so 8 internal constraints must be enforced. Ignoring them degrades accuracy significantly.
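The linear step alone can be sketched as follows. This is deliberately incomplete: it stacks the three-point equations [x']_× (Σ xⁱ T_i) [x'']_× = 0 for every correspondence and takes the null vector by SVD, but it omits both the normalization and the enforcement of the 8 internal constraints, which is tolerable only because the synthetic data here are noise-free. The function names, the storage convention `T[i, q, r]`, and the canonical first camera used to generate data are assumptions of the example.

```python
import numpy as np

def skew(v):
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def linear_trifocal(x1, x2, x3):
    """Linear estimate of T from (N, 3) arrays of homogeneous points, N >= 7.

    No normalization and no constraint enforcement: the linear step only.
    """
    blocks = []
    for p1, p2, p3 in zip(x1, x2, x3):
        # Equation (s, t): sum over i, q, r of p1[i] [p2]x[s, q] T[i, q, r] [p3]x[r, t] = 0
        coeff = np.einsum('i,sq,rt->stiqr', p1, skew(p2), skew(p3))
        blocks.append(coeff.reshape(9, 27))
    A = np.vstack(blocks)
    _, _, Vt = np.linalg.svd(A)
    return Vt[-1].reshape(3, 3, 3)      # unit-norm solution of A t = 0

# Noise-free synthetic correspondences from cameras [I | 0], P2, P3.
rng = np.random.default_rng(5)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = rng.standard_normal((3, 4))
P3 = rng.standard_normal((3, 4))
X = rng.standard_normal((12, 4))
x1, x2, x3 = X @ P1.T, X @ P2.T, X @ P3.T

T_est = linear_trifocal(x1, x2, x3)
T_true = (np.einsum('qi,r->iqr', P2[:, :3], P3[:, 3])
          - np.einsum('q,ri->iqr', P2[:, 3], P3[:, :3]))

# T is homogeneous, so compare directions of the 27-vectors.
cosine = abs(np.vdot(T_est, T_true)) / (np.linalg.norm(T_est) * np.linalg.norm(T_true))
print(cosine)
```

With noisy measurements the null vector would no longer correspond to a valid tensor, which is where the normalization and the algebraic or geometric refinement listed in the table come in.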

What is the minimum number of point correspondences across three views needed by the linear algorithm to compute the trifocal tensor?

7
13
18

Chapter 8: Properties and Constraints

Property	Detail
Size	3 × 3 × 3 = 27 entries
DOF	18 (= 3×11 − 15)
Internal constraints	8 (algebraic constraints on the entries)
Camera recovery	Camera matrices can be recovered from T up to projective ambiguity
Uniqueness	T is determined up to scale by the three cameras and is unchanged (up to scale) by a projective transformation of 3-space

Affine trifocal tensor: If all three cameras are affine (last row (0,0,0,1)), then 11 of the 27 entries of T are zero. The affine tensor has only 16 non-zero entries, defined up to scale, and 12 DOF (3 × 8 affine camera parameters minus a 12-DOF affine ambiguity).
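The zero pattern of the affine tensor can be checked directly from the determinant definition. The sketch below assumes the convention T^qr_i = (−1)^(i+1) det[rows of A except i; row q of B; row r of C] stored as `T[i, q, r]`, and uses random affine cameras. The helper names are hypothetical.

```python
import numpy as np

def trifocal_from_determinants(A, B, C):
    T = np.zeros((3, 3, 3))
    for i in range(3):
        rows_a = np.delete(A, i, axis=0)
        for q in range(3):
            for r in range(3):
                M = np.vstack([rows_a, B[q], C[r]])
                T[i, q, r] = (-1) ** i * np.linalg.det(M)
    return T

def random_affine_camera(rng):
    """Random 3x4 camera whose last row is (0, 0, 0, 1)."""
    P = rng.standard_normal((3, 4))
    P[2] = [0.0, 0.0, 0.0, 1.0]
    return P

rng = np.random.default_rng(6)
A, B, C = (random_affine_camera(rng) for _ in range(3))
T = trifocal_from_determinants(A, B, C)

zero_mask = np.isclose(T, 0.0, atol=1e-12)
n_zero = int(zero_mask.sum())
print(n_zero, 27 - n_zero)
```

The zeros arise whenever the row (0, 0, 0, 1) appears twice in the 4×4 determinant: five entries in each of the first two slices and one in the third.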

How many internal algebraic constraints must the trifocal tensor satisfy?

8 (= 26 homogeneous entries − 18 DOF)
0
18

Chapter 9: Connections

Direction	Connection
F → T	T generalizes F to three views; it encodes all three pairwise F's
T → Q	The quadrifocal tensor (4 views, 3×3×3×3) further generalizes T; it has 29 DOF
T → Ch 18	T provides initialization for N-view bundle adjustment
T → Ch 19	T can be used for auto-calibration from three views

"The trifocal tensor may be thought of as a book of homographies, one for each epipolar plane."

— Hartley & Zisserman, Chapter 15

What is the four-view generalization of the trifocal tensor?

The quadrifocal tensor Q (3×3×3×3, 29 DOF)
A larger fundamental matrix
There is no four-view analogue
